Comparing Weighting Models for Monolingual Information Retrieval

Authors

  • Gianni Amati
  • Claudio Carpineto
  • Giovanni Romano
Abstract

Motivated by the hypothesis that the retrieval performance of a weighting model is independent of the language in which queries and collection are expressed, we compared the retrieval performance of three weighting models, i.e., Okapi, statistical language modeling (SLM), and deviation from randomness (DFR), on three monolingual test collections, i.e., French, Italian, and Spanish. The DFR model was found to consistently achieve better results than both Okapi and SLM, whose performance was comparable. We also evaluated whether the use of retrieval feedback improved retrieval performance; retrieval feedback was beneficial for DFR and Okapi and detrimental for SLM. Besides relative performance, DFR with retrieval feedback achieved excellent absolute results: best run for Italian and Spanish, third run for French.

Similar resources

EXETER at CLEF 2002: Experiments with Machine Translation for Monolingual and Bilingual Retrieval

This year, the University of Exeter participated in both the CLEF 2002 monolingual and bilingual task for two languages: Italian and Spanish. We submitted 4 ranked results each for both Italian and Spanish Monolingual tasks and 5 each for the bilingual tasks. We report experimental results from our investigations of merging topic translations from two machine translation (MT) systems and recent...

ITC-irst at CLEF 2000: Italian Monolingual Track

This paper presents work on document retrieval for Italian carried out at ITC-irst. Two different approaches to information retrieval were investigated, one based on the Okapi weighting formula and one based on a statistical model. Development experiments were carried out using the Italian sample of the TREC-8 CLIR track. Performance evaluation was done on the Cross Language Evaluation Forum (C...

Italian Text Retrieval for CLEF 2000 at ITC-irst

This paper presents work on document retrieval for Italian carried out at ITC-irst. Two different approaches to information retrieval were investigated, one based on the Okapi weighting formula and one based on a statistical model. Development experiments were carried out using the Italian sample of the TREC-8 CLIR track. Performance evaluation was done on the Cross Language Evaluation Forum (C...

Translation Term Weighting and Combining Translation Resources in Cross-Language Retrieval

In TREC-10 the Berkeley group participated only in the English-Arabic cross-language retrieval (CLIR) track. One Arabic monolingual run and four English-Arabic cross-language runs were submitted. Our approach to the cross-language retrieval was to translate the English topics into Arabic using online English-Arabic bilingual dictionaries and machine translation software. The five official runs a...

University of Hagen at GeoCLEF 2008: Combining IR and QA for Geographic Information Retrieval

This paper describes the participation of GIRSA at GeoCLEF 2008, the geographic information retrieval task at CLEF. GIRSA is a modified and improved variant of the system which participated at GeoCLEF 2007. It combines results retrieved with methods from information retrieval (IR) on geographically annotated data and question answering (QA) employing query decomposition. For the monolingual Ger...

Publication date: 2003